Overview  of  Scientific  Progress.  “Head-related  transfer  functions”  (HRTFs)  capture  the
direction-dependent  filter  characteristics  of  the  external  ears.  When  a  sound  is  filtered  by 
HRTFs  measured  from  a  listener’s  own  ears  and  played  over  headphones,  the  listener  hears  a 
virtual  source  that  is  well  localized  in  space.  When  sounds  were  filtered  by  other  listeners’ 
HRTFs,  listeners  showed  fairly  accurate  localization  in  the  lateral  dimension  but  showed 
conspicuous  vertical  and  front/back  errors.  We  examined  differences  among  HRTFs  measured 
from  45  listeners.  We  quantified  differences  by  subtracting  HRTFs  between  listeners  for
corresponding  locations,  computing  the  variance  of  each  resulting  difference  spectrum,  and
averaging  the  variances  across  393  locations.  Inter-listener  differences  could  be  reduced  by
scaling  HRTFs  in
frequency.  Optimal  scalars  reduced  variances  by  an  average  of  20.2%  across  all  pairs  of 
listeners  and  by  more  than  50%  in  9.5%  of  listener  pairs.  The  optimal  scalar  for  any  pair  of 
listeners  correlated  highly  with  the  relative  sizes  of  certain  physical  dimensions.  When  HRTFs 
were  shifted  optimally  then  used  in  virtual  localization  trials,  all  measures  of  virtual  localization 
performance  tended  to  improve.  In  the  majority  of  cases,  the  performance  penalty  for  use  of 
HRTFs  from  another  listener  was  reduced  by  more  than  half. 

Description  of  Scientific  Accomplishments.  The  principal  objective  of  this  research  project  was 
to  develop  techniques  by  which  head-related  transfer  functions  (HRTFs)  can  be  processed  to
compensate  for  inter-listener  differences,  thus  making  virtual  auditory  environment  technology
accessible  to  a  broad  range  of  listeners.  The  major  results  are  that  inter-listener  differences
in  HRTFs  can  be  reduced,  by  more  than  half  for  some  listener  pairs,  by  scaling  HRTFs  in
frequency  and  that  frequency  scaling  of  HRTFs  results  in  substantial  improvements  in
localization  of  virtual  targets  under  conditions  in  which  listeners  use  HRTFs  recorded  from
other  listeners.  Specific  accomplishments  are  described
below: 

•  We  recorded  HRTFs  from  45  human  listeners.  Each  set  of  HRTFs  consisted  of  recordings 
from  left  and  right  ears  for  sound  sources  at  400  locations  in  azimuth  and  elevation.  For  the 
purpose  of  quantitative  evaluation,  each  HRTF  was  processed  with  a  bank  of  85  narrowband 
filters,  and  a  spatial  interpolation  procedure  was  used  to  obtain  processed  HRTFs  at  393 
locations,  each  representing  a  constant  solid  angle. 

•  We  devised  a  metric  to  represent  the  differences  between  sets  of  HRTFs  for  each  pair  of 
listeners.  Log  magnitudes  in  dB  of  HRTFs  for  each  source  location  were  subtracted,  one 
subject  from  the  other,  across  a  frequency  band  from  3.7  to  12.9  kHz,  then  the  variances  of 
the  resulting  difference  spectra  were  computed.  Inter-listener  variances  for  each  pair  of 
subjects  were  averaged  across  all  393  locations. 

•  Visual  inspection  of  HRTFs  indicated  that  HRTFs  from  various  listeners  appeared  to  vary 
systematically  in  the  position  of  spectral  features  (i.e.,  peaks  and  notches)  along  the  frequency
axis.  We  found  that  inter-listener  variances  could  be  reduced  by  scaling  HRTFs  in  frequency 
(or,  equivalently,  shifting  in  octave  frequency).  Scaling  by  optimal  values  reduced  variances 
by  an  average  of  20.2%  across  all  990  pair-wise  combinations  of  45  listeners  and  by  more 
than  half  in  9.5%  of  listener  pairs.  The  mean  magnitude  of  the  optimal  octave  frequency 
shift  across  all  pairs  of  subjects  was  0.107  octave  (i.e.,  a  scalar  of  107.7%). 

•  Optimal  scalars  between  pairs  of  listeners  correlated  with  a  measure  of  the  ratios  of  their
maximum  interaural  time  differences  (r  =  .68).

•  Optimal  scalars  correlated  with  certain  physical  dimensions.  We  measured  the  widths  of 
listeners’  heads  and  several  dimensions  of  their  external  ears.  The  highest  correlation
(r  =  .82)  was  between  octave  frequency  scalars  and  a  weighted  sum  of  the  width  of  the  head
and  the  height  of  the  pinna  (from  inter-tragal  notch  to  the  helix).

•  We  tested  the  accuracy  of  localization  of  virtual  targets  under  conditions  in  which  listeners 
used  HRTFs  recorded  from  their  own  ears  (the  own-ear  condition),  used  HRTFs  recorded
from  other  listeners  (the  other-ear  condition),  or  used  HRTFs  from  other  listeners  that  had
been  scaled  optimally  in  frequency  (the  scaled-ear  condition).  Fourteen  subjects
were  tested  in  the  own-ear  condition  and  in  a  total  of  58  cases  in  the  other-ear  condition. 
Eleven  of  the  listeners  also  were  tested  in  the  scaled-ear  condition  in  a  total  of  21  cases. 

•  In  the  other-ear  condition,  listeners  localized  fairly  accurately  in  the  lateral  dimension, 
although  the  smallest  listeners  (i.e.,  those  with  short  interaural  delays)  tended  to  overshoot 
the  targets  when  listening  through  HRTFs  measured  from  larger  listeners.  Similarly,  large 
listeners  tended  to  undershoot  when  listening  through  HRTFs  measured  from  small  listeners. 

•  Listeners  tended  to  show  conspicuous  localization  errors  in  the  vertical  and  front/back 
dimensions.  These  errors  included  systematic  errors  in  the  vertical  dimension  within  a 
particular  quadrant  and  “quadrant  errors”  in  which  targets  were  mislocalized  to  the  wrong 
down-front,  up-front,  up-rear,  or  down-rear  quadrant. 

•  In  cases  in  which  small  listeners  used  HRTFs  from  larger  listeners:  1)  down-front  targets 
often  were  localized  to  up-front;  2)  up-rear  targets  often  were  localized  to  up-front;  and  3) 
front/back  confusions  were  more  common  than  up/down  confusions.

•  In  cases  in  which  large  listeners  used  HRTFs  from  smaller  listeners:  1)  there  was  a 
substantial  rate  of  front/back  confusions,  although  the  rate  was  lower  than  in
small-listener/large-HRTF  cases;  and  2)  in  trials  in  which  there  was  no  quadrant  error,  there  was  a
systematic  upward  bias  in  location  judgements. 

•  The  magnitude  of  virtual-localization  errors  and  the  rate  of  quadrant  errors  increased  in 
proportion  to  the  inter-listener  variance  between  a  listener’s  own  HRTFs  and  the  HRTFs 
through  which  he  or  she  listened. 

•  Nearly  every  quantitative  measure  of  localization  accuracy  improved  when  the  inter-listener 
variance  between  HRTFs  was  reduced  by  scaling  in  frequency.  In  62%  of  cases,  the  increase 
in  vertical  and  front/back  error  in  the  other-ear  compared  to  the  own-ear  condition  was 
reduced  by  more  than  half  by  scaling  in  frequency. 
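
The inter-listener variance metric and the search for an optimal frequency scalar described in the accomplishments above can be illustrated with a minimal Python sketch. It is not the code used in the project. The function names, the toy spectra, the 0.025-octave frequency grid, the five "locations", the use of linear interpolation, the grid search, and the choice of population variance are all assumptions made for illustration; only the 3.7 to 12.9 kHz evaluation band and the 0.107-octave figure come from the report.

```python
import numpy as np

def inter_listener_variance(a_db, b_db):
    """Variance across frequency of the dB difference spectrum at each
    location, averaged over locations. Inputs: (n_locations, n_bands)."""
    diff = a_db - b_db
    # Population variance (ddof=0) is an assumption; the report does not say.
    return float(np.mean(np.var(diff, axis=1)))

def shift_spectra(hrtf_db, log2_f, shift_oct):
    """Move spectral features up by shift_oct octaves, which is the same
    as scaling them in frequency by the factor 2**shift_oct."""
    # The shifted spectrum at log-frequency x is the original at x - shift_oct.
    query = log2_f - shift_oct
    return np.stack([np.interp(query, log2_f, row) for row in hrtf_db])

def optimal_shift(a_db, b_db, log2_f, band, candidates):
    """Grid search for the octave shift of listener A's spectra that
    minimizes the inter-listener variance within the evaluation band."""
    def cost(shift):
        shifted = shift_spectra(a_db, log2_f, shift)
        return inter_listener_variance(shifted[:, band], b_db[:, band])
    return min(candidates, key=cost)

# --- Synthetic demonstration (made-up spectra, not measured HRTFs) ---
log2_f = np.linspace(1.0, 4.3, 133)            # log2(kHz), 0.025-octave steps
band = (log2_f >= np.log2(3.7)) & (log2_f <= np.log2(12.9))
phases = np.linspace(0.0, np.pi, 5)[:, None]   # five hypothetical locations

def toy_spectrum(x):
    """Smooth peaks and notches in dB as a function of log2 frequency."""
    return 10 * np.sin(2 * np.pi * x / 0.7 + phases) + 5 * np.sin(2 * np.pi * x / 0.31)

b_db = toy_spectrum(log2_f)          # listener B
a_db = toy_spectrum(log2_f + 0.1)    # listener A: features 0.1 octave lower

candidates = np.linspace(-0.3, 0.3, 25)
best = optimal_shift(a_db, b_db, log2_f, band, candidates)

before = inter_listener_variance(a_db[:, band], b_db[:, band])
after = inter_listener_variance(
    shift_spectra(a_db, log2_f, best)[:, band], b_db[:, band])
reduction_pct = 100.0 * (1.0 - after / before)

# Converting an octave shift to a frequency scalar: 2**0.107 is about 1.077.
scalar = 2.0 ** 0.107
```

In this toy case the search recovers the 0.1-octave offset built into the synthetic spectra, and the variance reduction is nearly complete because the two spectra differ only by that shift. Measured HRTFs differ in other ways as well, which is consistent with the much smaller average reduction reported above. The final line reproduces the report's conversion of a 0.107-octave shift to a scalar of about 107.7%.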